Clustering high dimensional mixed data to uncover sub-phenotypes: joint analysis of phenotypic and genotypic data

Abstract

The LIPGENE-SU.VI.MAX study, like many others, recorded high dimensional continuous phenotypic data and categorical genotypic data. LIPGENE-SU.VI.MAX focuses on the need to account for both phenotypic and genetic factors when studying the metabolic syndrome (MetS), a complex disorder that can lead to higher risk of type 2 diabetes and cardiovascular disease. Interest lies in clustering the LIPGENE-SU.VI.MAX participants into homogeneous groups or sub-phenotypes, by jointly considering their phenotypic and genotypic data, and in determining which variables are discriminatory.

A novel latent variable model which elegantly accommodates high dimensional, mixed data is developed to cluster LIPGENE-SU.VI.MAX participants using a Bayesian finite mixture model. A computationally efficient variable selection algorithm is incorporated, estimation is via a Gibbs sampling algorithm and an approximate BIC-MCMC criterion is developed to select the optimal model.

Two clusters or sub-phenotypes ('healthy' and 'at risk') are uncovered. A small subset of variables is deemed discriminatory which notably includes phenotypic and genotypic variables, highlighting the need to jointly consider both factors. Further, seven years after the LIPGENE-SU.VI.MAX data were collected, participants underwent further analysis to diagnose presence or absence of the MetS. The two uncovered sub-phenotypes strongly correspond to the seven year follow up disease classification, highlighting the role of phenotypic and genotypic factors in the MetS, and emphasising the potential utility of the clustering approach in early screening. Additionally, the ability of the proposed approach to define the uncertainty in sub-phenotype membership at the participant level is synonymous with the concepts of precision medicine and nutrition.
